Automatic Discovery of Fuzzy Synsets from Dictionary Definitions

Authors

  • Hugo Gonçalo Oliveira
  • Paulo Gomes
Abstract

To deal with ambiguity in natural language, it is common to organise words, according to their senses, into synsets: groups of synonymous words that can be seen as concepts. The manual creation of a broad-coverage synset base is a time-consuming task, so we take advantage of dictionary definitions for extracting synonymy pairs, and of clustering for identifying synsets. Since word senses are not discrete, we create fuzzy synsets, where each word has a membership degree. We report on the results of the creation of a fuzzy synset base for Portuguese from three electronic dictionaries. The resulting resource is larger than existing handcrafted Portuguese thesauri.
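The idea of turning synonymy pairs into fuzzy synsets can be illustrated with a small sketch. This is not the authors' clustering procedure, which the abstract does not specify. It is a toy stand-in that assumes membership can be approximated by the Jaccard overlap between the synonym neighbourhoods of two words. The function names, the `threshold` parameter, and the English word pairs are all hypothetical.

```python
from collections import defaultdict


def build_neighbourhoods(pairs):
    """Map each word to the set containing itself and its listed synonyms."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].update((a, b))
        neighbours[b].update((a, b))
    return dict(neighbours)


def jaccard(x, y):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(x & y) / len(x | y)


def fuzzy_synsets(pairs, threshold=0.3):
    """Build one fuzzy synset per seed word.

    Assumption (not from the paper): a word's membership degree in the
    synset seeded by another word is the Jaccard overlap of their
    neighbourhoods. Words below `threshold` are left out.
    """
    neighbours = build_neighbourhoods(pairs)
    synsets = {}
    for seed, seed_nb in neighbours.items():
        members = {}
        for word, word_nb in neighbours.items():
            degree = jaccard(seed_nb, word_nb)
            if degree >= threshold:
                members[word] = round(degree, 2)
        synsets[seed] = members
    return synsets


# Hypothetical synonymy pairs, as might be extracted from definitions.
pairs = [
    ("car", "automobile"),
    ("car", "auto"),
    ("automobile", "auto"),
    ("bank", "shore"),
    ("bank", "lender"),
]

synsets = fuzzy_synsets(pairs)
for seed, members in synsets.items():
    print(seed, members)
```

In this toy data the ambiguous word "bank" belongs fully to its own synset, while "shore" and "lender" each join it with a lower degree, which is the kind of graded membership the abstract describes.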

Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

On the Automatic Enrichment of a Portuguese Wordnet with Dictionary Definitions

Besides synsets and semantic relations, synset glosses are an important feature of wordnets. However, due to the required effort, their creation is sometimes left undone. This happens in Onto.PT, a Portuguese wordnet created automatically, which does not have glosses. In our work, we exploited Portuguese dictionaries to automatically assign definitions to the synsets of Onto.PT. For this purpos...

Towards Automatic Evaluation of Wordnet Synsets

Increasing and varied applications of wordnets call for the creation of methods to evaluate their quality. However, no such comprehensive methods to rate and compare wordnets exist. We begin our search for wordnet evaluation strategies by attempting to validate synsets. As synonymy forms the basis of synsets, we present an algorithm based on dictionary definitions to verify that the words prese...

Automatic Induction of Synsets from a Graph of Synonyms

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clus...

Watset: Automatic Induction of Synsets from a Graph of Synonyms

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clus...

Publication date: 2011